large-scale channel gain
Interference-Limited Ultra-Reliable and Low-Latency Communications: Graph Neural Networks or Stochastic Geometry?
Liu, Yuhong, She, Changyang, Zhong, Yi, Hardjawana, Wibowo, Zheng, Fu-Chun, Vucetic, Branka
In this paper, we aim to improve the Quality-of-Service (QoS) of Ultra-Reliable and Low-Latency Communications (URLLC) in interference-limited wireless networks. To obtain time diversity within the channel coherence time, we first put forward a random repetition scheme that randomizes the interference power. Then, we optimize the number of reserved slots and the number of repetitions for each packet to minimize the QoS violation probability, defined as the percentage of users that cannot achieve URLLC. We build a cascaded Random Edge Graph Neural Network (REGNN) to represent the repetition scheme and develop a model-free unsupervised learning method to train it. We analyze the QoS violation probability using stochastic geometry in a symmetric scenario and apply a model-based Exhaustive Search (ES) method to find the optimal solution. Simulation results show that in the symmetric scenario, the QoS violation probabilities achieved by the model-free learning method and the model-based ES method are nearly the same. In more general scenarios, the cascaded REGNN generalizes very well in wireless networks with different scales, network topologies, cell densities, and frequency reuse factors. It outperforms the model-based ES method in the presence of model mismatch.
Yuhong Liu, Changyang She, Wibowo Hardjawana and Branka Vucetic are with the School of Electrical and Information Engineering, The University of Sydney, Sydney, Australia. Yi Zhong is with the School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, P. R. China.
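The random repetition scheme and the QoS violation metric in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the uniform choice of slots, and the use of the number of co-active users as a stand-in for interference are all assumptions made here for illustration. Each user sends its packet in a random subset of the reserved slots, so the set of interferers changes from one repetition to the next.

```python
import random

def random_repetition_pattern(num_users, num_slots, num_reps, seed=0):
    """Each user picks num_reps distinct slots uniformly at random
    out of num_slots reserved slots (an assumed selection rule)."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(num_slots), num_reps))
            for _ in range(num_users)]

def interferers_per_repetition(pattern, num_slots):
    """For every repetition of every user, count the other users
    transmitting in the same slot (a crude proxy for interference)."""
    load = [0] * num_slots
    for slots in pattern:
        for s in slots:
            load[s] += 1
    return [[load[s] - 1 for s in slots] for slots in pattern]

def qos_violation_fraction(error_probs, target):
    """Fraction of users whose error probability exceeds the target,
    following the abstract's definition of QoS violation probability."""
    return sum(p > target for p in error_probs) / len(error_probs)

pattern = random_repetition_pattern(num_users=5, num_slots=8, num_reps=3)
counts = interferers_per_repetition(pattern, num_slots=8)
```

In this toy view, increasing the number of reserved slots lowers the expected number of interferers per repetition, while increasing the number of repetitions adds diversity at the cost of more collisions, which is the trade-off the abstract says is optimized.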
Accelerating Deep Reinforcement Learning With the Aid of a Partial Model: Power-Efficient Predictive Video Streaming
Liu, Dong, Zhao, Jianyu, Yang, Chenyang, Hanzo, Lajos
Predictive power allocation is conceived for power-efficient video streaming over mobile networks using deep reinforcement learning. The goal is to minimize the accumulated energy consumption over a complete video streaming session for a mobile user under the quality of service constraint that avoids video playback interruptions. To handle the continuous state and action spaces, we resort to the deep deterministic policy gradient (DDPG) algorithm for solving the formulated problem. In contrast to previous predictive resource policies that first predict future information with historical data and then optimize the policy based on the predicted information, the proposed policy operates in an online and end-to-end manner. By judiciously designing the action and state so that they depend only on slowly-varying average channel gains, the signaling overhead between the edge server and the base stations can be reduced, and the dynamics of the system can be learned effortlessly. To improve the robustness of streaming and accelerate learning, we further exploit the partially known dynamics of the system by integrating the concepts of safety layer, post-decision state, and virtual experience into the basic DDPG algorithm. Our simulation results show that the proposed policies converge to the optimal policy derived based on perfect prediction of the future large-scale channel gains and outperform the first-predict-then-optimize policy in the presence of prediction errors. By harnessing the partially known model of the system dynamics, the convergence speed can be dramatically improved.
This paper was presented in part at IEEE Globecom 2019 [1].
I. INTRODUCTION
Mobile video traffic is expected to account for more than 75% of the global mobile data by 2021, and video-on-demand (VoD) services represent the main contributor [2].
To avoid video stalling for a user experiencing hostile channel conditions, a base station (BS) can increase its transmit power for ensuring that the video segment is downloaded before being played.
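As a rough illustration of why a poor channel forces a higher transmit power, the sketch below computes the minimum power needed to finish downloading a segment before its playback deadline. It assumes a Shannon-capacity rate model with a constant channel gain over the remaining time; that model, the function name, and all parameter names are assumptions for illustration and are not taken from the paper.

```python
import math

def min_power_to_finish(bits_remaining, time_remaining_s,
                        bandwidth_hz, channel_gain, noise_power_w):
    """Smallest transmit power (in watts) that delivers bits_remaining
    within time_remaining_s, assuming the achievable rate is
    bandwidth * log2(1 + gain * power / noise)."""
    if bits_remaining <= 0:
        return 0.0
    if time_remaining_s <= 0:
        raise ValueError("deadline already passed")
    required_rate = bits_remaining / time_remaining_s        # bit/s
    required_snr = 2 ** (required_rate / bandwidth_hz) - 1   # invert log2(1 + snr)
    return required_snr * noise_power_w / channel_gain

# Example: 1 Mbit left, 1 s to the deadline, 1 MHz bandwidth.
power_w = min_power_to_finish(1e6, 1.0, 1e6, channel_gain=1e-9,
                              noise_power_w=1e-12)
```

Under this model the required power grows exponentially with the required rate, so waiting until the deadline is close is expensive. This is consistent with the motivation for allocating power predictively over the whole session.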
Learning to Optimize with Unsupervised Learning: Training Deep Neural Networks for URLLC
Sun, Chengjian, Yang, Chenyang
Learning the optimized solution as a function of environmental parameters is effective in solving numerical optimization problems in real time for time-sensitive applications. Existing works on learning to optimize train deep neural networks (DNNs) with labels, and the learnt solutions are inaccurate, so they cannot be employed to ensure a stringent quality of service. In this paper, we propose a framework to learn the latent function with unsupervised deep learning, where the property that the optimal solution should satisfy is used implicitly as the "supervision signal". The framework is applicable to both functional and variable optimization problems with constraints. We take a variable optimization problem in ultra-reliable and low-latency communications as an example, which demonstrates that ultra-high reliability can be supported by the DNN without supervision labels.
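The idea of training without labels can be sketched on a toy constrained problem. The example below is only an illustration and not the paper's framework: it replaces the DNN with a linear model x(a) = w*a + b, and the problem (minimize the mean of (x(a) - a)^2 subject to the mean of x(a) not exceeding a budget) is an assumption chosen so that the answer is easy to check. The loss is the Lagrangian itself, updated by primal descent and dual ascent, and no precomputed optimal solutions are used as targets.

```python
import random

def train_without_labels(samples, budget, step=0.1, iters=4000):
    """Primal-dual training of x(a) = w*a + b with no target labels.

    Toy problem (an assumption, not from the source):
        minimize  mean((x(a) - a)^2)  subject to  mean(x(a)) <= budget.
    """
    w, b, lam = 0.0, 0.0, 0.0
    n = len(samples)
    mean_a = sum(samples) / n
    for _ in range(iters):
        xs = [w * a + b for a in samples]
        # Gradients of the Lagrangian: objective + lam * (mean(x) - budget).
        grad_w = sum(2 * (x - a) * a for x, a in zip(xs, samples)) / n + lam * mean_a
        grad_b = sum(2 * (x - a) for x, a in zip(xs, samples)) / n + lam
        violation = sum(xs) / n - budget
        w -= step * grad_w                       # primal descent
        b -= step * grad_b
        lam = max(0.0, lam + step * violation)   # dual ascent, kept non-negative
    return w, b, lam

rng = random.Random(0)
samples = [rng.random() for _ in range(100)]     # environmental parameter a in [0, 1)
w, b, lam = train_without_labels(samples, budget=0.3)
```

For this toy problem the constrained optimum is x(a) = a - lam/2, so the learned slope should approach 1 and the offset should shift the mean of x(a) down to the budget. The training signal comes entirely from the objective and the constraint, which is the sense in which the optimality property acts as the supervision signal.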